ABSTRACT

The purpose of this study was to assess the effects
of correlated dimensions and differential ability on one dimension on
parameter estimation when using a two-dimensional item response
theory model. Multidimensional analysis of simulated two-dimensional
item response data fitting the M2PL model of M. D. Reckase (1985,
1986) was conducted using the MIRTE analysis program. Six data sets
(2,000 ability vectors by 104 items) were generated to satisfy two
conditions of the distributions of the ability dimensions and three
different degrees of correlation between two abilities. The six data
sets (two distributions times three correlations) and analyses were
replicated 100 times each. Summary statistics on the 100 replications
were used to assess the effects of the degree of correlation between
ability dimensions and differential ability on the second dimension.
Results indicate that the MIRTE program recovers the structure of a
multidimensional correlated space better than do previous estimation
programs, especially in the cases in which the items were
multidimensional in themselves. The MIRTE program tended to
underestimate the degree of correlation between the ability
dimensions, but it did not force orthogonality on the dimensions.
Because of the limitations imposed on any single body of research in
terms of research design, some alternative situations need to be
studied. Future investigations should assess the accuracy of
estimation procedures when a guessing parameter and different latent
space structures are included. (TJH)
Abstract

The purpose of this study was to assess the effects of correlated
dimensions and differential ability on one dimension on parameter estimation
when using a two-dimensional IRT model. Past research has shown the
inadequacies of unidimensional analysis of multidimensional item response
data. However, few studies have reported multidimensional analysis of
multidimensional data and, in those which used simulated data, results were
usually based on one replication.

Multidimensional analysis of simulated two-dimensional item response
data fitting the M2PL model of Reckase (1985a, 1985b, 1986) was done using the
analysis program MIRTE (Carlson, 1987).

Six data sets (2000 ability vectors by 104 items) were generated to satisfy
two conditions of the distributions of the ability dimensions and three different
degrees of correlation between the two abilities. The six data sets (2
distributions x 3 correlations) and analyses were replicated 100 times each.
Summary statistics on the 100 replications were used to assess the effects of
degree of correlation between ability dimensions and differential ability on the
second dimension.

With the exception of the discrimination parameter on the second
dimension and the multidimensional discrimination parameter, ability and item
parameters were adequately recovered in the data sets in which both abilities
were normally distributed over the full range. In the data sets with a restricted
range of ability on the second dimension, recovery of the ability and item
parameters was adversely affected. As the correlation between the dimensions
increased and there was less ability on the second dimension, the dimensions
appeared to become less distinguishable. The latent space seemed to be
collapsing into a more unidimensional space when the ability dimensions were
correlated 0.50.

Results indicate that MIRTE recovers the structure of a multidimensional
correlated space better than previous estimation programs have done, especially
in the cases in which the items were multidimensional in themselves. Because
of the limitations imposed on any single piece of research in terms of research
design, some alternative situations need to be studied. Further investigation
remains to be done on the accuracy of estimation procedures when a guessing
parameter is included, as well as with different latent space structures, both
in terms of population and items.
Theoretical Framework

The original Item Response Theory (IRT) models were based on the
assumption of unidimensionality (i.e., only one ability was required to correctly
respond to all the items). When more than one ability accounts for test
performance, the test is multidimensional and a Multidimensional Item Response
Theory (MIRT) model is required to accurately fit the data.

Consider the situation in which items for a test are designed to measure
one ability (e.g., mathematics) but require some amount of a second ability (e.g.,
verbal) in order to respond correctly. This second, required ability could be
more crucial to success for some examinees than others. Students of English as
a Second Language (ESL) may have sufficient mathematics ability but lack the
required amount of verbal ability in order to make a correct response. This
could be described as a situation in which mathematics ability is distributed
normally over a full range but verbal ability is distributed normally with a
lower mean over a narrower range. It is reasonable to assume the two abilities
are correlated to some extent. What happens to ability estimates for the ESL
students if a MIRT model is used to fit their responses? How are the ability
estimates affected by the degree of correlation between the abilities?

Several authors (e.g., Ackerman, 1987; Ansley & Forsyth, 1985; Bogan &
Yen, 1983; Dorans & Kingston, 1985; Drasgow & Parsons, 1983; McCauley &
Mendoza, 1985; McKinley & Reckase, 1984; Reckase, 1979, 1985b; Reckase,
Carlson, Ackerman, & Spray, 1986) have considered the effects of analyzing
known multidimensional data with a unidimensional item response model. The
resulting estimates in most cases were not acceptable unless there was clearly
one dominant dimension. Ansley and Forsyth (1985) reported that the
unidimensional ability estimates were most highly related to the average of the
multidimensional abilities. In the hypothetical educational situation described
above, this would be unacceptable if students with high mathematics ability but
low verbal ability were penalized in placement or selection procedures. Reckase
et al. (1986) found that the unidimensional ability estimates established from
multidimensional data had different interpretations at different points on the
unidimensional ability scale. By and large, the resulting unidimensional
estimates from multidimensional data have been difficult to interpret and have
not reflected well the original characteristics of the data.

In spite of findings that unidimensional models are not often robust to
multidimensionality, few researchers have made use of multidimensional
models to analyze multidimensional data. There are good reasons for this.
Although MIRT models are being developed and tested, they are more complex
than their unidimensional counterparts. Analysis of multidimensional data
with multidimensional programs is expensive in terms of computer time. Few
multidimensional analysis programs exist and none has undergone exhaustive
testing. Only two programs have been readily available: (1) TESTFACT (Wilson,
Wood, & Gibbons, 1984), and (2) MAXLOG (McKinley & Reckase, 1983b).
TESTFACT has been deemed inappropriate by some researchers because it uses a
linear factor analytic procedure to describe the non-linear IRT relationship, a
particularly contentious procedure with multidimensional data (Ansley, 1984;
Lord, 1980; McDonald & Ahlawat, 1974; R. L. McKinley, personal
communication, November 13, 1986). MAXLOG was written to provide
parameter estimates for uncorrelated abilities. Results of pilot testing of a third
multidimensional analysis program, MIRTE (Carlson, 1987), indicate that it
estimates item parameters and abilities more efficiently and more accurately
than MAXLOG and that it can accommodate data from correlated dimensions. The
program is designed to analyze data which fit the multidimensional
two-parameter logistic (M2PL) model (McKinley & Reckase, 1983a; Reckase, 1985b,
1986).

In a test requiring two ability dimensions, if a group of examinees had a
normal distribution over the full range of the primary ability but a narrower
range and lower mean on the secondary ability, how would this affect
parameter estimates? McCauley and Mendoza (1985), in a study of
identification of item bias, generated data for items which required a secondary
ability on which two groups of examinees differed in mean level. However, the
data were generated to conform to a specific factor structure and the analysis
was done using a unidimensional model. Their results indicated that
differential ability affected the estimates of difficulty more so than
discrimination. The results are not generalizable to multidimensional analysis
of multidimensional data.

It is unreasonable to assume abilities are uncorrelated for most
achievement tests. McKinley and Reckase (1984) considered the effects of
analyzing data generated for correlated dimensions using MAXLOG. The ability
and item estimates were confounded in the results of the data analysis.
However, when the underlying abilities were correlated and a unidimensional
analysis was used, again both unidimensional ability and item parameter
estimates were affected (McKinley & Reckase, 1984).

Researchers who have used multidimensional analysis (e.g., McKinley,
1983; McKinley & Reckase, 1983a, 1983b, 1984; Muraki & Englehard, 1985) have
indicated that a multidimensional model more adequately describes both real
and simulated multidimensional data than does a unidimensional model.
However, in most cases, the simulation studies have been based on no
replications, so the stability of estimates is difficult to determine. There is a
need to know how consistently these estimates are recovered. The effects of
both correlated abilities and differential secondary ability on parameter
estimation need to be evaluated in a comprehensive, systematic manner.

Purpose of the Study

The purpose of this study was to determine the adequacy of
multidimensional ability and item parameter estimates using a MIRT analysis.
Specifically, three questions were to be addressed:

(1) What is the effect of correlated ability dimensions on parameter
estimation for a two-parameter, two-dimensional IRT model?

(2) What is the effect of differential ability on the secondary ability on
parameter estimation for the same model?

(3) Are the effects of correlated dimensions similar over the two
distributions?

Methodology 

A Monte Carlo study was chosen to answer the research questions. 

Model Description 

The data for the study were generated to fit the multidimensional
two-parameter logistic (M2PL) model (McKinley & Reckase, 1983a), which was
updated by Reckase (1985a, 1985b, 1986). A description of the updated version
follows.

The mathematical formula is given by Equation (1):

$$P_{ij} = P(x_{ij} = 1 \mid \mathbf{a}_i, d_i, \boldsymbol{\theta}_j) = \frac{\exp(\mathbf{a}_i'\boldsymbol{\theta}_j + d_i)}{1 + \exp(\mathbf{a}_i'\boldsymbol{\theta}_j + d_i)} \tag{1}$$

(i = 1, 2, ..., n; j = 1, 2, ..., N)

where P_ij is the probability of a correct response to item i by examinee j, x_ij is
the response (1 = correct, 0 = incorrect) of examinee j on item i, a_i is a vector of
m discrimination parameters, d_i is a parameter representing the difficulty of
item i, θ_j is a vector of m ability parameters for individual j, N is the number of
examinees, n is the number of items, and m is the number of dimensions.

This model is compensatory in that it allows high proficiency on one
dimension to compensate for low proficiency on other dimensions in arriving at
a correct response to a test item.
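
The compensatory property can be illustrated numerically. The following is a minimal sketch of Equation (1) in Python, not the code used in the study; the function name `m2pl_prob` and the three example ability vectors are assumptions made for illustration. The item values (a = (1.414, 1.414), d = 0) are those listed for item 46 in Table 1.

```python
import math

def m2pl_prob(theta, a, d):
    """Probability of a correct response under Equation (1)."""
    z = sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d
    # exp(z) / (1 + exp(z)) written in the equivalent form 1 / (1 + exp(-z))
    return 1.0 / (1.0 + math.exp(-z))

a = (1.414, 1.414)  # discriminations of item 46 (45 degree item)
d = 0.0             # difficulty parameter of item 46

p_balanced = m2pl_prob((0.0, 0.0), a, d)    # average on both abilities
p_tradeoff = m2pl_prob((1.0, -1.0), a, d)   # high on ability 1, low on ability 2
p_low_both = m2pl_prob((-1.0, -1.0), a, d)  # low on both abilities
```

For this item, an examinee one unit above the mean on the first ability and one unit below on the second has the same success probability as an examinee at the mean on both, whereas an examinee low on both abilities has a much smaller probability.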

Reckase (1986) defined a multidimensional discrimination parameter for
item i to be

$$\mathrm{MDISC}_i = \left[ \sum_{k=1}^{m} a_{ik}^2 \right]^{0.5} \tag{2}$$

This parameter is related to the item characteristic curve on the
multidimensional item response surface above the line through the origin of the
ability space and to the point of maximum information and is therefore
analogous to the unidimensional discrimination parameter (Carlson, 1987).

Reckase (1985b) also defined a multidimensional item difficulty
parameter, MDIF_i, such that

$$\mathrm{MDIF}_i = \frac{-d_i}{\left[ \sum_{k=1}^{m} a_{ik}^2 \right]^{0.5}} = \frac{-d_i}{\mathrm{MDISC}_i} \tag{3}$$

This parameter represents the distance between the origin of the
m-dimensional ability space and the point in the space where the item information
is a maximum. The line joining this point to the origin is at an angle of α_ik to
the kth ability dimension, where

$$\cos \alpha_{ik} = \frac{a_{ik}}{\left[ \sum_{h=1}^{m} a_{ih}^2 \right]^{0.5}} \tag{4}$$
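
Equations (2) through (4) can be checked against the tabled item parameters. The sketch below is illustrative only; the function names are assumptions, and the example values (a = (1.732, 1.00), d = -6) are those listed for item 27 in Table 1, for which the tabled MDISC is 2.00, MDIF is 3.0, and the angle with the first dimension is 30 degrees.

```python
import math

def mdisc(a):
    """Multidimensional discrimination, Equation (2)."""
    return math.sqrt(sum(a_k ** 2 for a_k in a))

def mdif(a, d):
    """Multidimensional difficulty, Equation (3)."""
    return -d / mdisc(a)

def angles_deg(a):
    """Angle (in degrees) between the item direction and each ability axis, Equation (4)."""
    m = mdisc(a)
    return [math.degrees(math.acos(a_k / m)) for a_k in a]

a, d = (1.732, 1.00), -6.0  # item 27
item_mdisc = mdisc(a)
item_mdif = mdif(a, d)
item_angles = angles_deg(a)
```

Because the tabled discriminations are rounded to three decimals, the computed values agree with the tabled MDISC, MDIF, and angle only to about three decimal places.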



Program Description

The program used to analyze the two-dimensional data was MIRTE
(Carlson, 1987). While a version now exists to provide estimates of item and
ability parameters for a M3PL model, the version of the program used estimated
parameters for the M2PL model. In addition to estimates of abilities, item
discriminations, and item difficulty, MIRTE provides estimates of standard
errors for each of these parameter estimates. Estimates of the multidimensional
item difficulty and discrimination are also provided. The method of estimation
used is a variation of the joint maximum likelihood procedure using a modified
Newton-Raphson iteration technique, and the algorithm used is similar to that
used in the unidimensional analysis program LOGIST (Wingersky, Barton, &
Lord, 1982). The MIRTE version (2.00) used in this study was found to estimate
parameters when dimensions were correlated better than MAXLOG (J. E.
Carlson, personal communication, December 1987). While MIRTE has been used
in one recent study (Ackerman, 1987) to estimate item parameters, the author
did not investigate the questions considered in this study.

Data Description

Six different data sets were used. The first three sets (A1, A2, A3)
represented cases in which both underlying abilities (θ1 and θ2) were normally
distributed with mean 0, standard deviation 1. The difference among the three
sets was the degree of correlation between the abilities, namely 0.00, 0.25, and
0.50. In the second group of data sets (B1, B2, B3) the first ability was normally
distributed (mean 0, standard deviation 1) but the second ability had a lower
mean and standard deviation (-1 and 0.67, respectively). Again, there were the
same three degrees of correlation between the two abilities.

The simulated test consisted of 104 items: 26 items requiring only the
first ability, 52 items requiring predominantly the first ability, and 26 items
requiring equal amounts of both abilities. A listing of the item parameters is
provided in Table 1. Thirteen values of MDIF (ranging from -3 to +3 at intervals
of 0.5) and two values of MDISC (2.00, 1.70) were chosen in order to cover the
range of difficulties and to simulate realistic discrimination conditions in which
the items were designed to discriminate well on the first ability. To meet the
requirement that the items discriminate well on the first ability, four values of
the angle α_i1 (0°, 15°, 30°, 45°) were chosen. The discrimination indices, a1
and a2 (one for each dimension), were then generated to fit the corresponding d
and MDISC. The correlations between the original item parameters were ρ(d,a1)
= 0.004, ρ(d,a2) = -0.004, ρ(a1,a2) = -0.736, and ρ(MDIF,MDISC) = -0.002. Because
of the dependency of a1 and a2, there is a larger correlation between these
parameters. The same item parameters were used for each of the six data sets.
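
The discrimination values in this design follow from Equations (2) and (4): for a two-dimensional item, a1 = MDISC cos(α_i1) and a2 = MDISC sin(α_i1). The sketch below rebuilds the 104 pairs under that reading; it is an illustration rather than the procedure actually used to produce Table 1, and the variable names are assumptions.

```python
import math
from statistics import mean, stdev

MDISC_VALUES = (2.00, 1.70)
ANGLES_DEG = (0, 15, 30, 45)   # angle with the first ability dimension
N_MDIF = 13                    # thirteen MDIF levels per MDISC/angle combination

items = []
for mdisc in MDISC_VALUES:
    for angle in ANGLES_DEG:
        rad = math.radians(angle)
        a1, a2 = mdisc * math.cos(rad), mdisc * math.sin(rad)
        # the same (a1, a2) pair is shared by all thirteen difficulty levels
        items.extend([(a1, a2)] * N_MDIF)

a1_vals = [a1 for a1, _ in items]
a2_vals = [a2 for _, a2 in items]
n_pure = sum(1 for _, a2 in items if a2 == 0.0)  # items requiring only ability 1
summary = (mean(a1_vals), stdev(a1_vals), mean(a2_vals), stdev(a2_vals))
```

Under these assumptions the means and standard deviations of a1 and a2 come out close to the true values reported with the item discrimination results (1.637 and 0.251 for a1, 0.678 and 0.496 for a2), and 26 items load on the first ability alone.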

Procedure 

The FORTRAN program M2PLGEN (Ackerman, 1985) was used to generate
2000 ability vectors (θ1, θ2) satisfying the distributions of θ1 and θ2 for Data Set
A1. M2PLGEN uses a random seed and the IMSL (1979) subroutine GGNSM to
generate random abilities. These ability vectors and the item parameters (a1,
a2, d) were then used to generate response vectors (0s and 1s) for each of the
2000 simulees to each of the 104 items according to the M2PL model.
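
The generation step can be sketched as follows. This is not M2PLGEN: it substitutes Python's standard `random` module for the IMSL generator, and the function names, the seed, and the two-item test are assumptions chosen for illustration. Correlated abilities are obtained from two independent standard normal draws, and each response is a Bernoulli draw with the success probability of Equation (1).

```python
import math
import random

def simulate_abilities(n, rho, mean2=0.0, sd2=1.0, rng=random):
    """Draw n (theta1, theta2) pairs with correlation rho; theta1 is N(0, 1)."""
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        theta1 = z1
        theta2 = mean2 + sd2 * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        pairs.append((theta1, theta2))
    return pairs

def simulate_responses(thetas, items, rng=random):
    """Return one 0/1 response vector per simulee under the M2PL model."""
    data = []
    for t1, t2 in thetas:
        row = []
        for a1, a2, d in items:
            p = 1.0 / (1.0 + math.exp(-(a1 * t1 + a2 * t2 + d)))
            row.append(1 if rng.random() < p else 0)
        data.append(row)
    return data

rng = random.Random(1)  # arbitrary seed, for reproducibility of the sketch
# Conditions resembling Data Set B3: correlation 0.50, second ability N(-1, 0.67)
thetas = simulate_abilities(2000, rho=0.50, mean2=-1.0, sd2=0.67, rng=rng)
items = [(1.414, 1.414, 0.0), (2.00, 0.00, -2.0)]  # items 46 and 5 of Table 1
responses = simulate_responses(thetas, items, rng)
```

Passing the full list of 104 (a1, a2, d) triples in place of the two example items would give a 2000 x 104 response matrix of the kind analyzed in the study.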
Table 1. True Item Parameters for the 104 Items

α_i1   MDIF_i   d_i    Item   MDISC   a_i1    a_i2     Item   MDISC   a_i1    a_i2

 0°     3.0     -6       1    2.00    2.00    0.00       53   1.70    1.70    0.00
 0°     2.5     -5       2    2.00    2.00    0.00       54   1.70    1.70    0.00
 0°     2.0     -4       3    2.00    2.00    0.00       55   1.70    1.70    0.00
 0°     1.5     -3       4    2.00    2.00    0.00       56   1.70    1.70    0.00
 0°     1.0     -2       5    2.00    2.00    0.00       57   1.70    1.70    0.00
 0°     0.5     -1       6    2.00    2.00    0.00       58   1.70    1.70    0.00
 0°     0.0      0       7    2.00    2.00    0.00       59   1.70    1.70    0.00
 0°    -0.5      1       8    2.00    2.00    0.00       60   1.70    1.70    0.00
 0°    -1.0      2       9    2.00    2.00    0.00       61   1.70    1.70    0.00
 0°    -1.5      3      10    2.00    2.00    0.00       62   1.70    1.70    0.00
 0°    -2.0      4      11    2.00    2.00    0.00       63   1.70    1.70    0.00
 0°    -2.5      5      12    2.00    2.00    0.00       64   1.70    1.70    0.00
 0°    -3.0      6      13    2.00    2.00    0.00       65   1.70    1.70    0.00
15°     3.0     -6      14    2.00    1.932   0.518      66   1.70    1.642   0.44
15°     2.5     -5      15    2.00    1.932   0.518      67   1.70    1.642   0.44
15°     2.0     -4      16    2.00    1.932   0.518      68   1.70    1.642   0.44
15°     1.5     -3      17    2.00    1.932   0.518      69   1.70    1.642   0.44
15°     1.0     -2      18    2.00    1.932   0.518      70   1.70    1.642   0.44
15°     0.5     -1      19    2.00    1.932   0.518      71   1.70    1.642   0.44
15°     0.0      0      20    2.00    1.932   0.518      72   1.70    1.642   0.44
15°    -0.5      1      21    2.00    1.932   0.518      73   1.70    1.642   0.44
15°    -1.0      2      22    2.00    1.932   0.518      74   1.70    1.642   0.44
15°    -1.5      3      23    2.00    1.932   0.518      75   1.70    1.642   0.44
15°    -2.0      4      24    2.00    1.932   0.518      76   1.70    1.642   0.44
15°    -2.5      5      25    2.00    1.932   0.518      77   1.70    1.642   0.44
15°    -3.0      6      26    2.00    1.932   0.518      78   1.70    1.642   0.44
30°     3.0     -6      27    2.00    1.732   1.00       79   1.70    1.472   0.85
30°     2.5     -5      28    2.00    1.732   1.00       80   1.70    1.472   0.85
30°     2.0     -4      29    2.00    1.732   1.00       81   1.70    1.472   0.85
30°     1.5     -3      30    2.00    1.732   1.00       82   1.70    1.472   0.85
30°     1.0     -2      31    2.00    1.732   1.00       83   1.70    1.472   0.85
30°     0.5     -1      32    2.00    1.732   1.00       84   1.70    1.472   0.85
30°     0.0      0      33    2.00    1.732   1.00       85   1.70    1.472   0.85
30°    -0.5      1      34    2.00    1.732   1.00       86   1.70    1.472   0.85
30°    -1.0      2      35    2.00    1.732   1.00       87   1.70    1.472   0.85
30°    -1.5      3      36    2.00    1.732   1.00       88   1.70    1.472   0.85
30°    -2.0      4      37    2.00    1.732   1.00       89   1.70    1.472   0.85
30°    -2.5      5      38    2.00    1.732   1.00       90   1.70    1.472   0.85
30°    -3.0      6      39    2.00    1.732   1.00       91   1.70    1.472   0.85
45°     3.0     -6      40    2.00    1.414   1.414      92   1.70    1.202   1.202
45°     2.5     -5      41    2.00    1.414   1.414      93   1.70    1.202   1.202
45°     2.0     -4      42    2.00    1.414   1.414      94   1.70    1.202   1.202
45°     1.5     -3      43    2.00    1.414   1.414      95   1.70    1.202   1.202
45°     1.0     -2      44    2.00    1.414   1.414      96   1.70    1.202   1.202
45°     0.5     -1      45    2.00    1.414   1.414      97   1.70    1.202   1.202
45°     0.0      0      46    2.00    1.414   1.414      98   1.70    1.202   1.202
45°    -0.5      1      47    2.00    1.414   1.414      99   1.70    1.202   1.202
45°    -1.0      2      48    2.00    1.414   1.414     100   1.70    1.202   1.202
45°    -1.5      3      49    2.00    1.414   1.414     101   1.70    1.202   1.202
45°    -2.0      4      50    2.00    1.414   1.414     102   1.70    1.202   1.202
45°    -2.5      5      51    2.00    1.414   1.414     103   1.70    1.202   1.202
45°    -3.0      6      52    2.00    1.414   1.414     104   1.70    1.202   1.202



The 2000 x 104 matrix of response vectors was analyzed using MIRTE to
provide estimates of θ1, θ2, a1, a2, d, MDIF, MDISC, α1, and α2. These results
were filed, the random seed was incremented by two, and the process was
repeated. For Data Set A1 there were 100 replications. Summary statistics
were calculated on the 100 replications.

This procedure was repeated for the other five data set conditions. The
same initial item parameter estimates for a1 and a2 were used for every
replication in order to provide better control in the design. Finally, summary
results from the six data sets were compared.

Each job of 100 replications required approximately 45,000 to 50,000 CPU
seconds. The jobs were run in batch on an Amdahl 5880 processor with 64
megabytes of main memory. The VM/HPO operating system was in use.

Results and Discussion

The purpose of this research was to determine the effects of correlated
abilities and differential ability on one dimension on parameter estimation given
a two-dimensional, two-parameter logistic item response model. First it should
be determined whether suitable ability data were generated to model the conditions
specified. Then it needs to be determined whether MIRTE adequately estimated
the parameters from the analysis of the response vectors generated. Results are
discussed in Part 1 for Data Sets A1, A2, and A3; in Part 2 for Data Sets B1, B2,
and B3; and in Part 3 for comparisons made among the A and B data sets. The
statistics given in this section are the mean values of the corresponding
statistics determined for each of the 100 replications in each data set.

Generation of (θ1, θ2). The ability data in all three data sets were
generated to fit the specifications stated. The correlation between θ1 and θ2 for
data generated over the 100 replications was recovered as -0.001 for Data Set
A1, 0.251 for A2, and 0.500 for A3. The means for θ1 and θ2 were in the range
0.002 to -0.004 and standard deviations were within 1 ± 0.003. There was very
small variance (less than 0.0005) for these means and standard deviations in all
data sets. There were no replications in which the ability data were not
satisfactorily generated.

In keeping with the findings of Greaud (1988), the mean raw score
appeared to be unaffected by changes in the degree of correlation between the
ability dimensions. (All raw score means were approximately 52.)

Recovery of Ability Parameters. In each of the three data sets over the
100 replications, θ̂1 and θ̂2 had means of 0.00 and standard deviations of 1.00.
The standard deviation of the mean was less than 0.001 for all data sets. The
recovery of these statistics is not particularly meaningful as a measure of
accuracy in these cases because the MIRTE program rescales the theta estimates
to mean 0, standard deviation 1 after each iteration in order to prevent drifting
of the estimates.
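
The rescaling step can be pictured with a short sketch. This is not MIRTE's code; the function name is hypothetical, and the use of the population (rather than sample) standard deviation is an assumption, since the text does not say which one the program uses.

```python
from statistics import mean, pstdev

def rescale(thetas):
    """Standardize ability estimates to mean 0 and standard deviation 1."""
    m, s = mean(thetas), pstdev(thetas)
    return [(t - m) / s for t in thetas]

raw = [0.4, -1.2, 2.5, 0.9, -0.3]  # made-up ability estimates after one iteration
scaled = rescale(raw)
```

Because every replication is forced to this scale, the means and standard deviations of the theta estimates carry no information about recovery, which is why deviation and correlation statistics are used instead.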

In the data analysis, the program does not always identify dimensions one
and two correctly. In order to avoid confusing the dimensions during the 100
replications, a check was made during each data analysis on the first thirteen
item discrimination parameter estimates. (These items were pure on θ1.) If the
sum of the first thirteen â1 estimates was less than the sum of the first
thirteen â2 estimates, the estimates for the dimensions were flipped.
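
The dimension check described above can be written in a few lines. The following is a sketch of that rule, not the code used in the study; the function name and the made-up estimates are hypothetical, and in practice the ability estimates and the other dimension-specific statistics would presumably be swapped along with the discriminations.

```python
def align_dimensions(a1_hat, a2_hat, n_pure=13):
    """Swap the estimated dimensions if the pure-theta1 items load on dimension 2.

    Returns the (possibly swapped) discrimination estimates and a flag
    indicating whether a swap took place.
    """
    if sum(a1_hat[:n_pure]) < sum(a2_hat[:n_pure]):
        return a2_hat, a1_hat, True
    return a1_hat, a2_hat, False

# Made-up estimates in which the program has labeled the dimensions in reverse
a1_hat = [0.1] * 13 + [1.5] * 3
a2_hat = [1.9] * 13 + [0.2] * 3
fixed_a1, fixed_a2, flipped = align_dimensions(a1_hat, a2_hat)
```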

The mean average absolute deviation of θ̂1 from the true θ1 (AAD(θ̂1))
ranged from 0.446 to 0.459 (see Table 2). (Note that the tables appearing in the
text contain results for all six data sets in order to save space and so that
comparisons can be seen more readily.) Increasing ρ(θ1,θ2) did not appear to
affect this. The mean average absolute deviation of θ̂2 (AAD(θ̂2)) ranged from
0.544 to 0.412 and seemed to be more affected by the correlation between the
abilities. As ρ(θ1,θ2) increased, the AAD(θ̂2) decreased. This is probably
because of the compensatory nature of the M2PL model. There was very little
variance over replications in these AADs (0.001 for θ̂1, 0.002 for θ̂2), so the
thetas appear to have been recovered consistently across the three data sets.
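
For reference, the average absolute deviation used here is the mean of the absolute differences between estimates and true values within one replication; the tabled figures are these values averaged over the 100 replications. A minimal sketch with made-up numbers (the function name is an assumption):

```python
def average_absolute_deviation(estimates, true_values):
    """Mean of |estimate - true value| over all examinees (or items)."""
    return sum(abs(e - t) for e, t in zip(estimates, true_values)) / len(true_values)

theta_true = [-1.0, 0.0, 0.5, 2.0]  # made-up true abilities
theta_hat = [-0.6, 0.3, 0.5, 1.5]   # made-up estimates
aad = average_absolute_deviation(theta_hat, theta_true)  # (0.4 + 0.3 + 0.0 + 0.5) / 4
```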

Table 2. Mean Values of Statistics for Estimated Thetas (over 100 replications)

Data
Set   ρ(θ1,θ2)  AAD(θ̂1)  AAD(θ̂2)  r(θ̂1,θ̂2)  r(θ1,θ̂1)  r(θ2,θ̂2)  r(θ1,θ̂2)  r(θ2,θ̂1)

A1      0.00     0.447     0.544     0.062      0.842      0.764      0.505     -0.295
A2      0.25     0.446     0.470     0.179      0.842      0.824      0.603     -0.050
A3      0.50     0.459     0.412     0.282      0.831      0.865      0.699      0.209
B1      0.00     0.463     0.856     0.147      0.773      0.517      0.662     -0.170
B2      0.25     0.544     1.079     0.201      0.765      0.623      0.713      0.052
B3      0.50     0.566     1.047     0.218      0.744      0.721      0.755      0.247



The relationship between the ability parameter θ1 and its estimate was
adequately recovered, as r(θ1,θ̂1) was greater than 0.83 for all three data sets.
In Data Set A3, θ2 appeared to be recovered better than θ1, in spite of the fact
that few items were measuring the θ2-space. This was also supported by the
decreasing AAD(θ̂2) as the correlation between the ability dimensions increased.
As ρ(θ1,θ2) increased, θ1 was less well recovered but θ2 was better recovered.
This was supported by the mean correlation between θ2 and θ̂2. As ρ(θ1,θ2)
increased, θ̂2 became more highly correlated with θ2 (Table 2). In all three data
sets, θ2 was recovered fairly well according to r(θ2,θ̂2).

The mean standard error of the thetas (as calculated by MIRTE) was
approximately 0.259, almost half the size of the AADs. The variance in these
mean standard errors was very small, although the standard errors were more
spread out as the correlation between the dimensions increased.

The correlation between the ability dimensions was not well recovered.
As ρ(θ1,θ2) increased, MIRTE tended to produce ability estimates which were
less correlated than the generated abilities. The difference between ρ(θ1,θ2) and
r(θ̂1,θ̂2) increased as ρ(θ1,θ2) increased. This result agrees with that reported
by Carlson (1987).

Recovery of Item Parameters. In the maximum likelihood estimation
procedures used in MIRTE, ability estimates are used to improve item parameter
estimates and vice versa. Hence, the final estimates are affected by each other.
As ρ(θ1,θ2) increased, what happened to the item parameter estimates?

Statistics on the item difficulty parameters are summarized in Table 3.
In all three data sets, r(d,d̂) = 0.997, indicating good recovery of the relationship
between the item difficulty parameter and its estimate. As ρ(θ1,θ2) increased, the
mean and standard deviation of d were increasingly overestimated but remained
close to the original parameter statistics. The AAD(d̂) increased slightly as the
correlation between the ability dimensions increased, indicating that d was being
less well recovered. However, the standard error of d̂ decreased as ρ(θ1,θ2)
increased. The mean and standard deviation of the multidimensional difficulty
parameter, MDIF, were recovered well, although here again MDIF was less well
recovered as ρ(θ1,θ2) increased. MDIF is a function of the discrimination
parameters and its estimate is therefore affected by the estimates of the a
parameters.
Table 3. Summary of Mean Statistics for Item Difficulty (over 100 replications)

Data                                             est.    s(est.              r(MDIF,
Set      d̂       s(d̂)    se(d̂)   AAD(d̂)    MDIF    MDIF)    r(d,d̂)   est. MDIF)

True    0.009    3.771                      -0.005   2.058
A1      0.009    3.929    0.112    0.224     0.006   2.079    0.997     0.995
A2      0.010    3.936    0.109    0.228     0.005   2.028    0.997     0.994
A3      0.030    3.936    0.106    0.232     0.012   1.999    0.997     0.391
B1     -0.726    3.995    0.137    0.811     0.460   2.535    0.984     0.958
B2     -0.734    4.001    0.132    0.827     0.434   2.459    0.982     0.956
B3     -0.716    4.044    0.123    0.834     0.397   2.247    0.982     0.969

s - standard deviation; se - standard error from MIRTE program



Discrimination parameter estimates have been reported to be affected
more by multidimensional data. This result was also evident in this study. The
mean of â1 was lower than the true mean and the standard deviation was
higher than the true standard deviation for all three data sets (see Table 4).
The mean of â2 was much higher than the true mean of 0.678. In fact, the mean
of â2 was higher than the mean of â1 and approached the true mean
of a1 as ρ(θ1,θ2) increased. Both means increased slightly as ρ(θ1,θ2) increased.
The standard deviation of â2 was higher than the true standard deviation, but
there was not as large a difference here as with â1. Standard errors of
estimation of â1 and â2 were approximately 0.09, but the AADs were much
larger, particularly for â2. As the correlation between the two ability
dimensions increased, the AAD(â2) increased slightly, indicating a2 was being
less well recovered. The AAD(â1) was approximately 0.5 for all three data sets.

The standard errors of both the discrimination parameter estimates were
similar in size, but se(â2) ≤ se(â1).

The multidimensional discrimination parameter, MDISC, was recovered
with a higher mean and higher standard deviation in all three data sets. There
appears to be a rotational indeterminacy in the recovery of the discrimination
parameters and a tendency to spread the discrimination parameter estimates
over the entire space even though they originally did not cover the entire space.


Table 4.

Data                                                                       est.    s(est.
Set     â1     s(â1)  se(â1)  AAD(â1)    â2     s(â2)  se(â2)  AAD(â2)   MDISC   MDISC)    α̂1

True   1.637   0.251                    0.678   0.496                    1.850   0.151    22.50
A1     1.195   0.569   0.099   0.500    1.379   0.512   0.096   0.707    1.957   0.288    49.07
A2     1.201   0.528   0.095   0.486    1.448   0.582   0.094   0.775    2.013   0.319    49.40
A3     1.202   0.502   0.093   0.490    1.510   0.628   0.093   0.836    2.057   0.381    49.98
B1     1.076   0.551   0.119   0.623    1.228   0.449   0.112   0.653    1.736   0.398    49.09
B2     1.094   0.557   0.138   0.620    1.298   0.501   0.108   0.708    1.803   0.449    49.76
B3     1.139   0.599   0.134   0.624    1.408   0.534   0.103   0.791    1.922   0.495    50.94

s - standard deviation; se - standard error from MIRTE program

This was supported by the statistics on the angle estimates, α̂1 and α̂2.
Originally α1 had a mean of 22.50°. This was recovered in all data sets at over
49°. Similarly, α2, whose original mean was 67.50°, was recovered in all data
sets at just over 40°. The original standard deviation of the angles increased
for the estimates to approximately 20°. There seemed to be an attempt to cover the
entire θ1θ2-space in estimation of parameters related to discrimination.
Estimates of α1 and α2 ranged from very close to 0° to almost 90°.

Correlation coefficients again were used to determine adequacy of the
parameter recovery (Table 5). In all cases, a1 correlated more highly with â1
than with â2. Similarly, a2 correlated more highly with â2 than it did with â1.
As well, a2 correlated more highly with â2 than a1 did with â2. The anomaly in the
correlations was that a1 correlated less highly with â1 than a2 did with â1. As
the discrimination parameter estimates appear to be dispersed across the
θ1θ2-space, this may account for the apparent better recovery of a2 than of a1. That
the standard deviation of a2 was twice as large as that of a1 may also account
for the higher correlations of both â1 and â2 with a2. The greater variability in
a2 would allow for higher correlations. The AAD(â2) did not support the
conclusion that a2 was better recovered than a1.

The correlation between â1 and â2 was slightly stronger than the true
parameter correlation of -0.738, except in Data Set A3, where it was slightly
smaller. The multidimensional discrimination parameter, MDISC, did not
correlate as highly with its estimate. This correlation was highest (0.600) when
the ability dimensions were uncorrelated and decreased as the correlation
between the abilities increased.

Table 5. Mean Correlations for Item Discrimination Values (over 100 replications)

Set  r(a1,â1)  r(a2,â2)  r(â1,â2)  r(a1,â2)  r(a2,â1)  r(MDISC,MDISC-hat)  r(α1,α̂1)

A1   0.834     0.893     -0.765    -0.572    -0.865    0.600               0.943
A2   0.818     0.899     -0.769    -0.587    -0.830    0.565               0.933
A3   0.760     0.895     -0.735    -0.587    -0.747    0.502               0.907
B1   0.530     0.523     -0.428    -0.309    -0.543    0.296               0.630
B2   0.460     0.511     -0.401    -0.306    -0.455    0.269               0.586
B3   0.431     0.514     -0.459    -0.285    -0.403    0.285               0.564

Carlson (1987) reported that estimates of the discrimination parameters
are sensitive to the distribution of the discrimination parameters in the
generated data. The a parameters were not distributed over the entire latent
space. This then restricted the recovery of these parameters which, in turn,
affected the recovery of the ability and difficulty parameters. That the items of
the simulated test did not cover the entire latent space and that the variability
in a2 was so much greater than in a1 would both affect recovery of parameters in a
detrimental way.

Generation of (θ1,θ2). Again the ability data in the three B data sets were
generated to fit the specifications stated. The correlation between θ1 and θ2 for
data generated over the 100 replications was recovered as -0.001 for Data Set
B1, 0.251 for B2, and 0.499 for B3. The means for θ1 were in the range of -0.002
to -0.004 with a standard deviation range of 1.000 to 1.003; for θ2 the means
were in the range -0.999 to -1.001 with a standard deviation range of 0.669 to
0.671. Again there was very small variance (less than 0.0005) for these means
and standard deviations in all data sets. There were no replications in which
the ability data were not satisfactorily generated.
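
The generation step described above can be sketched as follows. This is a minimal illustration, not the generation program used in the study: it assumes bivariate normal abilities, takes the B data set values from the text (θ1 with mean 0 and standard deviation 1, θ2 with mean -1 and standard deviation 0.67), and uses hypothetical function names and an arbitrary seed.

```python
import math
import random


def generate_abilities(n, rho, mean2=-1.0, sd2=0.67, seed=0):
    """Draw n (theta1, theta2) pairs with correlation rho.

    theta1 ~ N(0, 1); theta2 ~ N(mean2, sd2**2). The defaults follow the
    B data set values reported in the text; the seed is arbitrary.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Mix two independent normals so that corr(theta1, theta2) = rho.
        mixed = rho * z1 + math.sqrt(1.0 - rho ** 2) * z2
        pairs.append((z1, mean2 + sd2 * mixed))
    return pairs


def pearson_r(xs, ys):
    """Plain Pearson product-moment correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


for rho in (0.0, 0.25, 0.50):
    sample = generate_abilities(2000, rho)
    t1 = [p[0] for p in sample]
    t2 = [p[1] for p in sample]
    print(rho, round(pearson_r(t1, t2), 3), round(sum(t2) / len(t2), 3))
```

With 2,000 ability vectors per data set, the sample correlation and the θ2 mean should fall close to their target values, in line with the small replication variance reported above.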

The raw score on the test was affected by the differentiated ability on θ2.
The raw score means were about 5 points lower, at approximately 47. Increasing
the correlation between the ability dimensions did not appear to affect the raw
score mean.
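
The direction of this raw score effect can be illustrated with the compensatory M2PL response function, P = 1 / (1 + exp(-(a1·θ1 + a2·θ2 + d))). The sketch below makes strong simplifying assumptions that are not taken from the study: all 104 items share the same hypothetical parameters (the Table 4 true means for a1 and a2, with d = 0), the abilities are uncorrelated, and the A condition is assumed to have θ2 distributed N(0, 1). It therefore shows only that lowering θ2 lowers the expected raw score, not the size of the drop reported above.

```python
import math
import random


def m2pl_prob(theta1, theta2, a1, a2, d):
    """Probability of a correct response under the compensatory M2PL model."""
    return 1.0 / (1.0 + math.exp(-(a1 * theta1 + a2 * theta2 + d)))


def mean_expected_raw_score(mean2, sd2, n_items=104, n_people=2000, seed=1):
    """Average expected raw score for a sample with theta2 ~ N(mean2, sd2**2).

    Every item uses the same hypothetical parameters (a1, a2, d), so the
    expected raw score of a person is n_items times one probability.
    """
    rng = random.Random(seed)
    a1, a2, d = 1.637, 0.678, 0.0   # hypothetical common item parameters
    total = 0.0
    for _ in range(n_people):
        theta1 = rng.gauss(0.0, 1.0)
        theta2 = rng.gauss(mean2, sd2)
        total += n_items * m2pl_prob(theta1, theta2, a1, a2, d)
    return total / n_people


score_a = mean_expected_raw_score(0.0, 1.0)     # assumed A-type condition
score_b = mean_expected_raw_score(-1.0, 0.67)   # B-type condition from the text
print(round(score_a, 1), round(score_b, 1))
```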

Recovery of Ability Parameters. In each of the three data sets over the
100 replications, θ̂1 and θ̂2 had means of 0.00 and standard deviations of 1.00.
As the MIRTE program rescales the theta estimates to mean 0, standard
deviation 1 after each iteration, these estimates cannot be meaningfully
compared to the means and standard deviations of the generated parameters.
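
Why the rescaled estimates cannot be compared with the generated parameters can be seen from a generic standardization. The sketch below does not reproduce MIRTE's internal rescaling step; it only standardizes a hypothetical sample drawn with the B data set values for θ2 (mean -1, standard deviation 0.67) and shows that the original location and spread are no longer recoverable afterwards.

```python
import random
import statistics


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return [(v - m) / s for v in values]


rng = random.Random(2)   # arbitrary seed
theta2 = [rng.gauss(-1.0, 0.67) for _ in range(2000)]   # hypothetical sample
rescaled = standardize(theta2)

# The mean moves up by about one unit, so the sample "appears more able".
shift = statistics.fmean(rescaled) - statistics.fmean(theta2)
print(round(statistics.fmean(rescaled), 6),
      round(statistics.pstdev(rescaled), 6),
      round(shift, 3))
```

Any sample, whatever its true mean and spread, ends up with mean 0 and standard deviation 1 after this step, which is why only relational statistics such as correlations remain comparable.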

The AAD(θ1) ranged from 0.463 to 0.566 (see Table 2 above). As ρ(θ1,θ2)
increased, AAD(θ1) increased. The values of AAD(θ2) ranged from 0.856 to 1.047,
and changes in ρ(θ1,θ2) did not produce predictable changes in this statistic. The
rescaling of θ̂2 is reflected in the values of AAD(θ2). The variance in these
statistics over the replications was very small, as in the A data sets. The mean
standard error of the theta estimates in the B data sets was 0.237, larger than
that in the A data sets.

The recovery of the relationship between the parameter and its estimates
was not as high as in the A data sets for either θ1 or θ2. The relationship
between θ1 and its estimate was greater than 0.74, between θ2 and its estimate
greater than 0.52 (see Table 2 above). As ρ(θ1,θ2) increased, the r(θ1,θ̂2) also
increased. Increasing the correlation between the dimensions had the same
effect in the B data sets as in the A data sets, i.e., θ1 appeared to be less well
recovered and the recovery of θ2 improved.

The correlation between the ability dimensions was not well recovered.
As in the A data sets, MIRTE produced ability estimates which were less
correlated than the generated abilities when ρ(θ1,θ2) ≠ 0.

Neither θ1 nor θ2 was recovered as well in the B data sets as in the A
data sets. It seemed to be more difficult for the program to distinguish between
the dimensions and there was a greater tendency to collapse the space.

Recovery of Item Parameters. Statistics on the item difficulty parameters
are provided in Table 3 (above). Both d and MDIF were less well recovered in
the B data sets than in the corresponding A data sets. The rescaling of θ̂2 to
mean 0, standard deviation 1 in the MIRTE program made the estimates of the
theta vectors for the "sample" in the B data sets appear more able than the
original theta vectors would indicate. This resulted in the items appearing to be
more difficult than they were. There were larger standard errors for d̂ and
larger standard deviations for d̂ and the MDIF estimates than in the A data
sets. As ρ(θ1,θ2) increased, se(d̂) decreased but AAD(d) increased. The AAD(d)
were much larger than in the A data sets. The mean of MDIF was greatly
overestimated. However, the recovery of the relationship between the parameter
and its estimate remained high. The correlation r(d,d̂) > 0.98 and
r(MDIF,MDIF-hat) > 0.95, and there was little change in these correlations as
ρ(θ1,θ2) increased.
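
A large AAD together with a high correlation is what a uniform shift in the estimates produces. The sketch below assumes that AAD is the mean of |estimate - parameter| over items (this section does not restate the formula) and uses made-up difficulty values with a hypothetical constant shift of -0.7.

```python
import math


def average_absolute_deviation(true_values, estimates):
    """Mean of |estimate - parameter|; the assumed definition of AAD."""
    pairs = list(zip(true_values, estimates))
    return sum(abs(e - t) for t, e in pairs) / len(pairs)


def pearson_r(xs, ys):
    """Plain Pearson product-moment correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


true_d = [-1.5, -0.5, 0.0, 0.5, 1.5]     # made-up d parameters
shifted_d = [d - 0.7 for d in true_d]    # every estimate is too low by 0.7

aad = average_absolute_deviation(true_d, shifted_d)
r = pearson_r(true_d, shifted_d)
print(round(aad, 3), round(r, 3))
```

The shift leaves the ordering and spacing of the items intact, so the correlation stays at its maximum while the AAD equals the size of the shift. This matches the pattern above, where the means were poorly recovered but the internal structure of the difficulties was retained.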

The estimates of the discrimination parameters were similar to those for
the A data sets (Table 4 above). The parameter a1 was underestimated, a2 was
overestimated, and the mean estimate of a2 was always larger than that of a1.
The se(â2) < se(â1), but se(â2) decreased as ρ(θ1,θ2) increased and se(â1)
increased. As ρ(θ1,θ2) increased, the means of both â1 and â2 increased and the
standard deviations both increased. There were larger AAD(a1) than in the A
data sets but smaller AAD(a2). While the statistics for the discrimination
parameters were more similar in the A and B data sets than those for the ability
or difficulty parameters, they were also more distorted in that the estimates
were less like the true values in all cases.

The parameter MDISC was overestimated only in the B3 data set. The
mean of this parameter was better recovered as ρ(θ1,θ2) increased, but the
standard deviation was increasingly overestimated and was not as well
recovered as in the A data sets. Results for the angle recovery were similar to
those found in the A data sets. There again seemed to be an attempt to cover
the entire space in estimation of parameters related to discrimination.

Correlation coefficients were used (see Table 5 above) to determine
adequacy of the parameter recovery. The correlation between â1 and â2 was
greatly reduced, but changes in ρ(θ1,θ2) did not have an effect on this. The
correlations between the parameter and its estimate were much lower than in
the corresponding A data sets for both a1 and a2. The relationship between the
multidimensional discrimination parameter MDISC and its estimate was also
reduced. Differentiated ability on θ2 did affect recovery of the discrimination
parameters.

Interaction Effects of Correlated Abilities and a Differentiated Ability.
There were four possible interaction effects, the first in the recovery of ρ(θ1,θ2).
There was an interaction between correlation of abilities and differentiated
ability on θ2 on the estimated correlation of abilities. For the B data sets, there
was little effect of correlated abilities. For the A data sets, much steeper slopes
resulted when r(θ̂1,θ̂2) was plotted against ρ(θ1,θ2) (Figure 1). There was a
poorer recovery of ρ(θ1,θ2) in the B data sets with the exception of B2. The
difference in r(θ̂1,θ̂2) was small between data sets A2 and B2 and perhaps is not
as meaningful. Indeed, this may not be a true interaction even though the lines
cross, as the B data sets consistently appear to recover ρ(θ1,θ2) less well. The
slightly better recovery of the ρ(θ1,θ2) of 0.25 may in fact be an artifact of a
regression line showing no relationship and consistently estimating correlation
close to 0.25 regardless of the true correlation. One would expect ρ(θ1,θ2) to be
better recovered in the full distribution of θ2 at any level of correlation.

[Figure 1: line plot of r(θ̂1,θ̂2) (vertical axis, approximately 0 to 0.3)
against ρ(θ1,θ2) (horizontal axis, 0.0 to 0.6), with separate lines for the
A Data Sets and the B Data Sets.]

Figure 1. The relationship between ρ(θ1,θ2) and r(θ̂1,θ̂2) for the six data
sets.

A second interaction occurred between correlation of abilities and
differentiated ability on the correlation of each ability estimate with its
parameter. As ρ(θ1,θ2) increased, r(θ1,θ̂1) decreased while r(θ2,θ̂2) increased.
Increased ρ(θ1,θ2) had more adverse effects on r(θ1,θ̂1) than on r(θ2,θ̂2). The
θ̂2 appeared to depend more on θ1 ability (i.e., r(θ1,θ̂2) increased as ρ(θ1,θ2)
increased and was larger in the B data sets than in the A data sets). This was
not the case with θ̂1, which did not seem to depend on θ2 in either A or B data
sets. The θ1θ2-space would appear to be collapsing. The distribution of the
discrimination parameters may be contributing to this result as much as
differentiated ability and correlation between abilities.

A third interaction was found as s(â1) and se(â1) were affected by
correlation of abilities and differentiated ability on θ2. As ρ(θ1,θ2) increased,
the s(â1) decreased in the A data sets but increased in the B data sets, whereas
s(â2) increased in both the A and B data sets. In the A data sets, the se(â1)
decreased as ρ(θ1,θ2) increased but increased in the B data sets. The se(â2)
decreased as ρ(θ1,θ2) increased in both A and B data sets. Increasing the degree
of correlation between abilities and a differentiated θ2 ability combine to give
poorer recovery of a1. While it would be expected that the recovery may
deteriorate in the B data sets, it was not expected that increasing ρ(θ1,θ2) would
cause further deterioration. As the abilities became more correlated, more
information is being used to estimate the second dimension (se(â2) < se(â1) and
se(â2) decreases as ρ(θ1,θ2) increases). As well, both AAD(a) values increased as
correlation increased. These results might be related to the recovery of the
mean of a2 as being larger than the mean of a1 and to the possible collapsing of
the space. Clearly, the B samples did not cover the ability space adequately. The
rescaling of the θ̂2 may be contributing to this interaction.

A fourth interaction was found between the correlation of abilities and
differentiated ability on θ2 affecting the mean of MDIF. Surprisingly, in the A
data sets, as ρ(θ1,θ2) increased, MDIF changed very little. In the B data sets, as
ρ(θ1,θ2) increased, MDIF decreased (the items appear to be getting easier). This
was as expected. Since MDIF is a function of d and MDISC, and a2 (a part of
MDISC) was better estimated in the B data sets, this may explain why MDIF
became smaller (indicating easier items) but d did not change. A differentiated
ability on θ2 affected the size of the difficulty means more so than the degree of
correlation.
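
The dependence of MDIF on d and MDISC can be made concrete with a small worked example. It assumes the usual Reckase definitions, which this section does not restate, and uses a hypothetical intercept d = -1.0 together with the B1 and B3 mean MDISC estimates from Table 4.

```latex
\mathrm{MDISC} = \sqrt{a_1^2 + a_2^2}, \qquad
\mathrm{MDIF} = \frac{-d}{\mathrm{MDISC}}

% Hypothetical d = -1.0 held fixed:
\mathrm{MDISC} = 1.736 \;\Rightarrow\; \mathrm{MDIF} = \frac{1.0}{1.736} \approx 0.576
\qquad
\mathrm{MDISC} = 1.922 \;\Rightarrow\; \mathrm{MDIF} = \frac{1.0}{1.922} \approx 0.520
```

Under these assumptions, a larger MDISC estimate by itself moves MDIF toward zero even when d is unchanged, which is consistent with the explanation offered above for the B data sets.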

Conclusions 

This research study was designed to determine how well
multidimensional IRT ability and item parameters would be estimated under
certain specified conditions. The conditions were different degrees of correlation
between the two ability dimensions and a differentiated ability on a second
dimension.

The results of the research indicated that as the ability dimensions
became more correlated, there was a tendency for the two-dimensional ability
space to collapse. MIRTE tended to underestimate the degree of correlation
between the ability dimensions but did not force orthogonality on the
dimensions. Of the item parameters, the difficulty parameter was recovered
most successfully. As the ability dimensions became more highly correlated, the
discrimination parameter estimate for the predominant dimension (a1) was
underestimated while discrimination on the second dimension (a2) was
overestimated. The discrimination parameters in general were not well
recovered. Increasing the correlation between the ability dimensions tended to
result in even poorer recovery of the discrimination parameters. For correlated
dimensions the effects of item structure and ability structure were compounded,
as found by McKinley and Reckase (1984). The discrimination parameters did
not cover the latent space adequately. In the recovery there was a tendency to
spread the discrimination parameters over the entire latent space. This also
occurred with the ability estimates and would indicate some rotational
indeterminacy in the recovery of the multidimensional correlated latent space.

Restrictions on the second ability dimension resulted in poorer estimation
for parameters of both ability dimensions. The differentiated ability on θ2
appeared to cause a large shift in the estimates of d, underestimating the mean
but retaining the internal structure of the item difficulties. The restrictions on
the second ability dimension made the recovery of the discrimination
parameters much worse than in the A data sets. The rescaling of the θ̂2
estimates clearly affected the parameter recovery for the B data sets,
particularly item difficulty.

Four interaction effects of correlation of abilities and a differentiated
ability on θ2 were noted. The correlation of abilities and differentiated ability
on θ2 affected recovery of ρ(θ1,θ2), recovery of the r(θi,θ̂i), the discrimination
parameters (in s(â1), se(â1), and AAD(a1)), and the mean of the estimate of MDIF.
The rescaling of θ̂2 and the poor coverage of the ability space and the item space
partially explained these effects.

As for the analogy of the ESL students, would these students be penalized
in placement based on the results of this test? Clearly their raw scores on the
tests were lower. As the ability dimensions became more correlated, the raw
score for these students improved only slightly. McKinley and Reckase (1984)
reported that ρ(θ1,θ2) was an important factor in the latent ability structure.
In terms of the recovery of the primary ability dimension, θ1, the ESL students
portrayed in the B data sets would have poorer recovery of this dimension as
indicated by r(θ1,θ̂1) and AAD(θ1). If the M2PL model were chosen to represent
the response data and MIRTE were used to analyze the data, these students
would probably be penalized if their θ̂1 estimates were used to determine
placement. However, because of the rescaling, the question of how the ability
estimates of the ESL students are affected cannot really be determined. If the A
and B data sets had been pooled together, it would have mirrored a more
realistic educational situation.

There are three issues of concern identified in this research: the
problems caused by the rescaling of the θ̂2 estimates, the recovery of the
two-dimensional space, and the dimensionality of the items.

The rescaling of the θ̂2 estimates in the B data sets affected estimates of
difficulty as well as estimates of thetas and discriminations. The estimates of
means of d and MDIF were adversely affected in the B data sets. However,
correlations between the parameters and the corresponding estimates were good.
The estimates of the mean of a2 improved in the B data sets. The extent of the
effects of rescaling cannot be determined from the results reported here,
but it appears that the rescaling problem affects all parameter estimates
somewhat.

The recovery of the structure of the ability space is also a concern.
There was a tendency for the space to collapse as the abilities became more
correlated. This may relate to a rotational indeterminacy in the recovery of the
abilities. In the initial research design, some items pure on the second
dimension were included in order to anchor the abilities in an attempt to
improve the recovery of all parameters. Since such a test would not simulate
the desired condition, this decision was not made. This might be reconsidered in
a future design. The collapsing of the space as ρ(θ1,θ2) increased not only
affected the theta estimates but also the discrimination estimates. In the B data
sets, the structure of the latent space was recovered less well than in the A
data sets. In retrospect, combining corresponding A and B data sets prior to
analysis of the raw score vectors would provide a sample which more typically
represents the situation in which ESL students would likely be placed and
would have allowed for better coverage of the θ1θ2-space. This might improve
the estimation of some parameters and it would also eliminate the rescaling
problem.

The third issue is the dimensionality of the item space. Twenty-six of
the items were unidimensional (pure on a1). The remaining 78 were two-
dimensional, 52 requiring more ability on the first dimension for a correct
response, 26 requiring equal amounts of both abilities. The latent structure of
the data was more complex than a two-dimensional test composed of two sets of
unidimensional items. There were serious concerns with respect to the recovery
of the item space, the most serious being the apparent dominance of â2 over â1,
or a2 over a1. The poor recovery of the discrimination parameters also affected
recovery of the difficulty and ability parameters. The item space seemed to
become somewhat unidimensional. The estimates of the a's were more alike and
the size of the α1 angles moved towards 45° with a2 becoming dominant. Since
the range of a2 was greater than that of a1, this could have affected the
dominance of a2 over a1.

Interpretation of parameter estimates appears to depend on the model,
ρ(θ1,θ2), and the characteristics of the data set. There is every indication from
the results of this research that there are indeed three components of
multidimensionality (subject dimensionality, test dimensionality, and the
interaction of the two) as suggested by McKinley and Reckase (1984). Although
the population may be multidimensional, if the test is largely unidimensional,
resulting scores may tend to unidimensionality as well. It may be expecting too
much of the model and MIRTE to have better recovery of the parameters relating
to the second dimension when few items measured that dimension and when the
populations in the B data sets were low on ability in the second dimension.

Several questions remain at the conclusion of this research which suggest
future studies. These are summarized briefly.

Are the results affected by the estimation procedures and/or the model
chosen? Replication of the research using different models (perhaps the M3PL
model of Began and Yen (1983) or a noncompensatory model) would indicate to
what extent model choice affected results. Inclusion of a guessing parameter in
the model would provide additional information. A more recent version of
MIRTE allows for inclusion of the c-parameter.

It would be useful as well to estimate item parameters only while holding
the given ability parameters fixed, and vice versa, to determine further the
efficiency of the MIRTE program. These results could be compared with those
obtained when item and ability parameters are simultaneously estimated.
Presumably both item and ability parameters would be better estimated.
However, one could study the effects of each by varying the other parameters,
i.e., specifying different conditions for item parameters in order to determine the
effects on the ability estimates and vice versa.

Corresponding A and B data sets could be combined in order to present the
ESL-type group in a large sample of wider variability more typical of a real-life
situation. This should solve some of the rescaling and space problems.

The test design might be altered to allow for better distribution of the
discrimination parameters. The discrimination and difficulty parameters might
be randomly generated to cover the space. The test would then not simulate the
condition that it primarily measure one of the two dimensions. However,
valuable information might be gained on parameter recovery.

It would be useful to determine how well the ability dimensions were
recovered at different ability levels rather than just at the mean level of ability,
although the standard errors, average absolute deviations, and correlations do
give some indication of overall recovery. This could be ascertained by looking at
the θ-vectors in different sections of the θ1θ2-space and comparing the original
(θ1,θ2) with its estimate. It would also be useful to know how influential the
second ability dimension became as the items required more of this ability for a
correct response.

Another area of interest is that of item difficulty. Further analysis of
the examinee results on easy versus difficult items at different ability levels
would provide useful information for test builders.

A test with a wider range of discrimination values could determine how
discrimination values affect recovery of item and ability parameters. Analysis
of discrimination parameter recovery in different areas of the ability space
could also be useful. Providing more items requiring both dimensions and some
items pure on both dimensions would provide some indication of how the
discrimination values need to be chosen to improve estimates. The poor
recovery of the discrimination parameters is a cause for concern.

This research study provides encouraging results for those working in
multidimensional item response theory. An important finding is the capability
of MIRTE to retain the structure of the data and the people. Although there was
some tendency to collapse the latent space as ρ(θ1,θ2) increased, estimates
provided by MIRTE recovered two dimensions. It would be judicious to further
develop estimation programs so that rotational solutions could be produced
which might alleviate the tendency to collapse a two-dimensional space as the
correlation between the dimensions increases.